Abstract

Active Peer-to-Peer (P2P) worms present serious threats 
to the global Internet by exploiting popular P2P applica- 
tions to perform rapid topological self-propagation. Active 
P2P worms pose more deadly threats than normal scanning 
worms because they do not exhibit easily detectable anoma- 
lies, thus many existing defenses are no longer effective. 

We propose an immunity system with Phagocytes — a 
small subset of elected P2P hosts that are immune with high 
probability and specialized in finding and "eating" worms 
in the P2P overlay. The Phagocytes will monitor their 
managed P2P hosts' connection patterns and traffic vol- 
ume in an attempt to detect active P2P worm attacks. Once 
detected, local isolation, alert propagation and software 
patching will take place for containment. The Phagocytes 
further provide the access control and filtering mechanisms 
for communication establishment between the internal P2P 
overlay and the external hosts. We design a novel adaptive 
and interaction-based computational puzzle scheme at the 
Phagocytes to restrain external worms attacking the P2P 
overlay, without influencing legitimate hosts' experiences 
significantly. We implement a prototype system, and eval- 
uate its performance based on realistic massive-scale P2P 
network traces. The evaluation results illustrate that our 
Phagocytes are capable of achieving a total defense against 
active P2P worms. 



1 Introduction 

The ability to gain control of a huge amount of Internet 
hosts could be easily achieved by the exploitation of worms 
which self-propagate through popular Internet applications 
and services. Internet worms have already proven their
capability of inflicting massive-scale disruption and damage
to the Internet infrastructure. These worms employ normal 
scanning as a strategy to find potential vulnerable targets, 
i.e., they randomly select victims from the IP address space. 
So far, there have been many existing schemes that are
effective in detecting such scanning worms [14], e.g., by
capturing the scanning events lITSi or by passively detecting
abnormal network traffic activities ifTDl.

In recent years, Peer-to-Peer (P2P) overlay applications
have experienced an explosive growth, and now dominate
large fractions of both the Internet users and traffic
volume [12]; thus, a new type of worms that leverage the
popular P2P overlay applications, called P2P worms, pose a very
serious threat to the Internet [2]. Generally, the P2P worms
can be grouped into two categories: passive P2P worms and 
active P2P worms. The passive P2P worm attack is gener- 
ally launched either by copying such worms into a few P2P 
hosts' shared folders with attractive names, or by partici- 
pating into the overlay and responding to queries with the 
index information of worms. Unable to identify the worm 
content, normal P2P hosts download these worms unsus- 
pectedly into their own shared folders, from which others 
may download later without being aware of the threat, thus 
passively contributing to the worm propagation. The
passive P2P worm attack could be mitigated by current
patching systems [28] and reputation models [25]. In this paper,
we focus on another serious P2P worm: the active P2P worm.

The active P2P worms could utilize the P2P overlay ap- 
plications to retrieve the information of a few vulnerable 
P2P hosts and then infect these hosts, or as an alterna- 
tive, these worms are directly released in a hit list of P2P 
hosts to bootstrap the worm infection. Since the active 
P2P worms have the capacity of gaining control of the in- 
fected P2P hosts, they could perform rapid topological self- 
propagation by spreading themselves to neighboring hosts, 
and in turn, spreading throughout the whole network to af- 
fect the quality of overlay service and finally cause the over- 
lay service to be unusable. The P2P overlay provides an ac- 
curate way for worms to find more vulnerable hosts easily 
without probing randomly selected IP addresses (i.e., low 
connection failure rate). Moreover, the worm attack traffic 
could easily blend into normal P2P traffic, so that the active 
P2P worms will be more deadly than scanning worms. That 
is, they do not exhibit easily detectable anomalies in the 
network traffic as scanning worms do, so many existing
defenses against scanning worms are no longer effective [32].

Besides the above internal infection in the P2P overlay,
the infected P2P hosts could in turn mount attacks on external
hosts. Similarly, since P2P overlay applications
are pervasive on today's Internet, it is also attractive for
malicious external hosts to mount attacks against the P2P
overlay applications and then employ them as an ideal platform
to perform future massive-scale attacks, e.g., botnet attacks.

In this paper, we aim to develop a holistic immunity sys- 
tem to provide the mechanisms of both internal defense and 
external protection against active P2P worms. In our
system, we elect a small subset of P2P overlay nodes,
Phagocytes, which are immune with high probability and
specialized in finding and "eating" active P2P worms. Each
Phagocyte in the P2P overlay is assigned to manage a group of
P2P hosts. These Phagocytes monitor their managed P2P 
hosts' connection patterns and traffic volume in an attempt 
to detect active P2P worm attacks. Once detected, the local 
isolation procedure will cut off the links of all the infected 
P2P hosts. Afterwards, the responsible Phagocyte performs 
the contagion-based alert propagation to spread worm alerts 
to the neighboring Phagocytes, and in turn, to other Phago- 
cytes. Here, we adopt a threshold strategy to limit the im- 
pact area and enhance the robustness against the malicious 
alert propagations generated by infected Phagocytes. Fi- 
nally, the Phagocytes help acquire the software patches and 
distribute them to the managed P2P hosts. With the above 
four modules, i.e., detection, local isolation, alert propaga- 
tion and software patching, our system is capable of pre- 
venting internal active P2P worm attacks from being effec- 
tively mounted within the P2P overlay network. 

The Phagocytes also provide the access control and fil- 
tering mechanisms for the connection establishment be- 
tween the internal P2P overlay and the external hosts. 
Firstly, the P2P traffic should be contained within the P2P 
overlay, and we forbid any P2P traffic to leak from the P2P 
overlay to external hosts. This is because such P2P traffic is 
generally considered to be malicious and it is possible that 
the P2P worms ride on such P2P traffic to spread to the ex- 
ternal hosts. Secondly, in order to prevent external worms 
from attacking the P2P overlay, we hide the P2P hosts' IP 
addresses with the help of scalable distributed DNS service, 
e.g., CoDoNS [20]. An external host that wants to gain
access to the P2P overlay has no alternative but to perform
an interaction towards the associated Phagocyte to solve an 
adaptive computational puzzle; then, according to the au- 
thenticity of the puzzle solution, the Phagocyte can deter- 
mine whether to process the request. 

We implement a prototype system, and evaluate its per- 
formance on a massive-scale testbed with realistic P2P net- 
work traces. The evaluation results validate the effective- 
ness and efficiency of our proposed holistic immunity sys- 
tem against active P2P worms. 

Outline. We specify the system architecture in section 2.
Sections 3 and 4 elaborate the internal defense and external
protection mechanisms, respectively. We then present the
experimental design in section 5 and discuss the evaluation
results in section 6. Finally, we give an overview of related
work in section 7 and conclude this paper in section 8.

Figure 1: System Architecture

2 System Architecture 

Current P2P overlay networks can generally be grouped
into two categories [15]: structured overlay networks, e.g.,
Chord [22], whose network topology is tightly controlled
based on a distributed hash table, and unstructured overlay
networks, e.g., Gnutella, which merely impose a loose
structure on the topology. In particular, most modern
unstructured P2P overlay networks utilize a two-tier structure
to improve their scalability: a subset of peers, called
ultra-peers, construct an unstructured mesh while the other peers,
called leaf-peers, connect to the ultra-peer tier to
participate in the overlay network.

As shown in Figure 1, the network architecture of our
system is similar to that of the two-tier unstructured P2P
overlay networks. In our system, a set of P2P hosts act as
the Phagocytes to perform the functions of defense and
protection against active P2P worms. These Phagocytes are
elected among the participating P2P hosts in terms of the
following metrics: high bandwidth, powerful processing
resources, sufficient uptime, and applying the latest patches
(interestingly, the experimental result shown in section 6
indicates that we actually do not need to have a large
percentage of Phagocytes applying the latest patches). As
existing two-tier unstructured overlay networks do, the
Phagocyte election is performed periodically; moreover, even if
an elected Phagocyte has been infected, our internal
defense mechanism (described in section 3) can still isolate
and patch the infected Phagocyte immediately. In particular,
the population of Phagocytes should be small as compared
to the total overlay population; otherwise, the scalability and
applicability are questionable.



As a result, each elected Phagocyte covers a number of 
managed P2P hosts, and each managed P2P host will belong 
to one closest Phagocyte. That is, the Phagocyte acts as 
the proxy for its managed hosts to participate into the P2P 
overlay network, and has the control over the managed P2P 
hosts. Moreover, a Phagocyte further connects to several 
nearby Phagocytes based on close proximity. 

Our main interest is the unstructured P2P overlay
networks, since most of the existing P2P worms target the
unstructured overlay applications [2]. Naturally, due to the
similar network architecture, our system can easily be de- 
ployed into the unstructured P2P overlay networks. More- 
over, for structured P2P overlay networks, a subset of P2P 
hosts could be elected to perform the functions of Phago- 
cytes. We aim not to change the network architecture of the 
structured P2P overlay networks; however, we elect Phago- 
cytes to form an overlay to perform the defense and pro- 
tection functions — this overlay acts as a security wall in a 
separate layer from the existing P2P overlay, thus not affect- 
ing the original P2P operations. In the next two sections, we 
will elaborate in detail our mechanisms of internal defense 
and external protection against active P2P worms. 

3 Internal Defense 

In this section, we first describe the active P2P worm at- 
tacks, and then, we design our internal defense mechanism. 

3.1 Threat Model 

Generally, active P2P worms utilize the P2P overlay to 
accurately retrieve the information of a few vulnerable P2P 
hosts, and then infect these hosts to bootstrap the worm in- 
fection. On one hand, a managed P2P host clearly knows 
its associated Phagocyte and its neighboring P2P hosts that 
are managed by the same Phagocyte; so now, an infected 
managed P2P host could perform the worm infection in sev- 
eral ways simultaneously. Firstly, the infected P2P host in- 
fects its neighboring managed P2P hosts very quickly. Sec- 
ondly, the infected P2P host attempts to infect its associated 
Phagocyte. Lastly, the infected managed P2P host could 
issue P2P key queries with worms to infect many vulnera- 
ble P2P hosts managed by other Phagocytes. On the other 
hand, a Phagocyte could be infected as well; if so, the in- 
fected Phagocyte infects its managed P2P hosts and then its 
neighboring Phagocytes. As a result, in such a topological 
self-propagation way, the active P2P worms spread through 
the whole system at extraordinary speed. 

3.2 Detection 

Since the active P2P worms propagate based on the
topological information and do not need to probe any random IP
addresses, their connection failure rate should be low;
moreover, the P2P worm attack traffic could easily blend
into normal P2P traffic. Therefore, the active P2P worms
do not exhibit easily detectable anomalies in the network
traffic as normal scanning worms do.

In our system, the Phagocytes are those elected P2P hosts 
with the latest patches, and they can help their managed P2P 
hosts detect the existence of active P2P worms by monitor- 
ing these managed hosts' connection transactions and traffic 
volume. If a managed P2P host always sends similar queries 
or sets up a large number of connections, the responsible 
Phagocyte deduces that this managed P2P host is infected. 
Another pattern the Phagocytes monitor is whether
a portion of the managed P2P hosts exhibit similar
behaviors, such as issuing similar queries, repeatedly
connecting to their neighboring hosts, or uploading/downloading
the same files; if so, these hosts are considered to be infected.

Concretely, a managed P2P host's latest behaviors are
processed into a behavior sequence consisting of continuous
$\langle operation, payload \rangle$ pairs. Then, we can
compute the behavior similarity between any two P2P hosts
by using the Levenshtein edit distance ifTSl. Without
loss of generality, we suppose that there are two
behavior sequences $BS_1$ and $BS_2$, in which
$BS_{i=1,2} = (\langle o_{i1}, p_{i1} \rangle, \ldots, \langle o_{ij}, p_{ij} \rangle, \ldots, \langle o_{in}, p_{in} \rangle)$,
where $1 \le j \le n$, and $n$ is the length of the behavior sequence.
Further, we can treat each behavior sequence $BS_i$ as the combination
of the operation sequence $o_i = (o_{i1}, \ldots, o_{ij}, \ldots, o_{in})$ and
the payload sequence $p_i = (p_{i1}, \ldots, p_{ij}, \ldots, p_{in})$. Now,
we simultaneously sort the operation sequence $o_2$ and the
payload sequence $p_2$ of the behavior sequence $BS_2$ to make
the following similarity score $sim(BS_1, BS_2')$ be maximum.
To obtain the optimal solution, we could adopt the
maximum weighted bipartite matching algorithm [27]; however,
for efficiency, we use the greedy algorithm to obtain
the approximate solution as an alternative.

Here, $BS_2'$ denotes the sorted $BS_2$; $o_{2j}'$ and $p_{2j}'$ denote the
$j$-th item of the sorted $o_2$ and $p_2$, respectively; $d(x, y)$ is
the Levenshtein edit distance function. Finally, we treat
the maximum $sim(BS_1, BS_2')$ as the similarity score of the
two behavior sequences. If the score exceeds a threshold $\theta_d$,
we consider the two P2P hosts perform similarly.
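The greedy matching described above can be illustrated with a minimal sketch. The exact scoring formula of the similarity equation is not reproduced here, so the per-pair score below (zero when the operations differ, otherwise one minus the payload edit distance normalized by the longer payload) and the averaging over the sequence length are assumptions made purely for illustration; the names `pair_similarity` and `greedy_similarity` are hypothetical.

```python
from typing import List, Tuple

Behavior = Tuple[str, str]  # (operation, payload)


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein edit distance d(a, b)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def pair_similarity(x: Behavior, y: Behavior) -> float:
    """ASSUMED per-pair score in [0, 1]; not the paper's formula."""
    (ox, px), (oy, py) = x, y
    if ox != oy:
        return 0.0
    longest = max(len(px), len(py))
    return 1.0 if longest == 0 else 1.0 - levenshtein(px, py) / longest


def greedy_similarity(bs1: List[Behavior], bs2: List[Behavior]) -> float:
    """Greedily pair each item of bs1 with its best unused item of bs2.

    This approximates the optimal reordering of bs2 that a maximum
    weighted bipartite matching would find.
    """
    unused = list(bs2)
    total = 0.0
    for item in bs1:
        if not unused:
            break
        best = max(range(len(unused)),
                   key=lambda idx: pair_similarity(item, unused[idx]))
        total += pair_similarity(item, unused.pop(best))
    return total / max(len(bs1), len(bs2), 1)
```

Under these assumptions, two hosts whose recent behaviors are the same up to reordering score 1.0, and a Phagocyte would compare the score against the threshold $\theta_d$.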

These detection operations are also performed between
Phagocytes at the Phagocyte tier, because Phagocytes could be
infected as well, even with the latest patches. Infected
Phagocytes could propagate the worm rapidly;
however, we have the local isolation, alert propagation
and software patching procedures in place to handle these
infected Phagocytes once they are detected by their neighboring
Phagocytes using the detection module described above.

Note that our detection mechanism is not a substitute
for the existing worm detection mechanisms, e.g.,
worm signature matching [5], but rather an effective
P2P-tailored complement to them. Specifically, some tricky P2P
worms may present the features of mild propagation rate,
polymorphism, etc., so they may maliciously propagate at
lower speed than the aggressive P2P worms; here, our
software patching module (section 3.5) and several existing
schemes [7, 26] can help mitigate such tricky worm attacks.

Moreover, a few elaborate P2P worms, e.g., P2P- 
Worm.Win32.Hofox, have recently been reported to be able 
to kill the anti-virus/anti-worm programs on P2P hosts [2];
at the system level, some local countermeasures have been 
devised to protect defense tools from being eliminated, and 
the arms race will continue. In this paper, we assume that 
P2P worms cannot disable our detection module, and there- 
fore, each Phagocyte can perform the normal detection op- 
erations as expected; so can the following modules. 

3.3 Local Isolation 

If a Phagocyte discovers that some of its managed P2P 
hosts are infected, the Phagocyte will cut off its connections 
with the infected P2P hosts, and ask these infected hosts to 
further cut off the links towards any other P2P hosts. Also, 
if a Phagocyte is detected (by its neighboring Phagocyte) 
as infected, the detecting Phagocyte immediately issues a 
message to ask the infected Phagocyte to cut off the con- 
nections towards the neighboring Phagocytes, and then to 
trigger the software patching module (section 3.5) at the
infected Phagocyte; after the software patching, these cut 
connections should be reestablished. With the local isola- 
tion module, our system has the capacity of self-organizing 
and self-healing. We utilize the local isolation to limit the 
impact of active P2P worms as quickly as possible. 

3.4 Alert Propagation 

If a worm event has been detected, i.e., any of the
managed P2P hosts or neighboring Phagocytes are detected as
infected, the Phagocyte propagates a worm alert to all its
neighboring Phagocytes. Further, once a Phagocyte has
received the worm alerts from more than a threshold $\theta_a$ of its
neighboring Phagocytes, it also propagates the alert to all its
neighboring Phagocytes that did not send out the alert. In
general, we should appropriately tune $\theta_a$ to limit the impact
area and improve the robustness against the malicious alert
propagation generated by infected Phagocytes.
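The threshold rule above can be sketched as a small per-Phagocyte state machine. This is only an illustration under stated assumptions: $\theta_a$ is interpreted as a fraction of the neighboring Phagocytes, each Phagocyte forwards at most once, and the class and method names are hypothetical.

```python
from typing import Set


class PhagocyteAlertState:
    """Tracks received worm alerts and decides when to forward them."""

    def __init__(self, neighbors: Set[str], theta_a: float = 0.5):
        self.neighbors = set(neighbors)
        self.theta_a = theta_a            # ASSUMED: fraction of neighbors
        self.alerted_by: Set[str] = set()
        self.forwarded = False

    def on_local_detection(self) -> Set[str]:
        """A locally detected worm event is sent to all neighbors."""
        self.forwarded = True
        return set(self.neighbors)

    def on_alert(self, sender: str) -> Set[str]:
        """Record an alert; return the neighbors to forward it to, if any."""
        if sender not in self.neighbors:
            return set()                  # ignore alerts from non-neighbors
        self.alerted_by.add(sender)
        if self.forwarded or not self.neighbors:
            return set()
        fraction = len(self.alerted_by) / len(self.neighbors)
        if fraction > self.theta_a:
            self.forwarded = True
            # Only neighbors that did not already send the alert.
            return self.neighbors - self.alerted_by
        return set()
```

With four neighbors and $\theta_a = 0.5$, a single infected neighbor sending bogus alerts cannot trigger forwarding on its own; three distinct alerting neighbors are required.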



3.5 Software Patching 

The analytical study in [24] implied that the effective
software patching is feasible for an overlay network if com- 
bined with schemes to bound the worm infection rate. In 
our system, the security patches are published to the partic- 
ipating P2P hosts using the following two procedures: 

Periodical patching: A patch distribution service pro- 
vided by system maintainers periodically pushes the latest 
security patches to all Phagocytes through the underlying 
P2P overlay, and then these Phagocytes install and distribute 
them to all their managed P2P hosts. Note that we can
utilize periodical patching to help mitigate the tricky P2P
worms (section 3.2), which are harder to detect.

Urgent patching: When a Phagocyte is alerted of a P2P
worm attack, it will immediately pull the latest patches from 
a system maintainer via the direct HTTP connection (for ef- 
ficiency, not via the P2P overlay), and then install and dis- 
seminate them to all its managed P2P hosts. 

Specifically, each patch must be signed by the system 
maintainer [28], so that each P2P host can verify the patch
according to the signature. Note that, the zero-day vulner- 
abilities are not fictional, thus installing the latest patches 
cannot always guarantee the worm immunity. The attackers 
may utilize these vulnerabilities to perform deadly worm 
attacks. We can integrate our system with some other
systems, e.g., Shield [26] and Vigilante [7], to defend against
such attacks, which can be found in l?). 

3.6 Preventing Attacks on External Hosts 

As much as possible, the Phagocytes provide the con- 
tainment of P2P worms in the P2P overlay networks. Fur- 
ther, we utilize the Phagocytes to implement the P2P traf- 
fic filtering mechanism which forbids any P2P connections 
from the P2P overlay to external hosts because such P2P 
connections are generally considered to be malicious — it is 
possible that the P2P worms ride on the P2P traffic to spread 
to the external hosts. We can safely make the assumption 
that P2P overlay traffic should be contained inside the P2P 
overlay boundary, and any leaked P2P traffic is abnormal. 
Therefore, once such leakage is detected, the Phagocytes
will perform the aforementioned procedures for local isolation, alert
propagation and software patching. 

4 External Protection 

Our external protection mechanism aims to protect the 
P2P overlay network against the external worm attacks. We 
hide the P2P hosts' IP addresses to prevent external hosts 
from directly accessing the internal P2P resources. This 
service can be provided by a scalable distributed DNS
system, e.g., CoDoNS [20]. Such a DNS system returns the
associated Phagocyte which manages the requested P2P host.
Then, the Phagocyte is able to adopt our following proposed
computational puzzle scheme to perform the function of
access control over the requests issued by the requesting
external host.

Step 1 (H → P): generate $SI_H$, store $SI_H$ ∥ $SI_H$ ∥ (null)

Step 2 (P → H): generate $k$, generate $SI_P$ ∥ $SI_H$, $SI_P$, $k$ ∥ check
$SI_H$, store $SI_P$

Step 3 (H → P): retrieve $\langle SI_H, SI_P \rangle$, solve puzzle ∥ $SI_H$, $SI_P$, $k$,
solution, request ∥ check $\langle SI_H, SI_P \rangle$, check $SI_P$, check solution,
store $\langle SI_H, SI_P \rangle$, process request

where

• $SI_H$: The session identifier of the external host.

• $SI_P$: The session identifier of the Phagocyte.

• $k$: The difficulty level of the puzzle.

Figure 2: Adaptive and Interaction-based Puzzle Scheme

4.1 Adaptive & Interaction-based Puzzle 

We propose a novel adaptive and interaction-based com- 
putational puzzle scheme at the Phagocytes to provide the 
access control over the possible external worms attacking 
the internal P2P overlay. 

Since we are interested in how messages are processed 
as well as what messages are sent, for clarity and simplic- 
ity, we utilize the annotated Alice-and-Bob specification to 
describe our puzzle scheme. As shown in Figure 2, to gain
access to the P2P overlay, an external host has to perform 
a lightweight interaction towards the associated Phagocyte 
to solve an adaptive computational puzzle; then, according 
to the authenticity of the puzzle solution, the Phagocyte can 
determine whether to process the request. 

Step 1. The external host H first generates a 64-bit
nonce $N_H$ as its session identifier $SI_H$. Then, the
external host stores $SI_H$ and sends it to the Phagocyte.

Step 2. On receiving the message consisting of $SI_H$ sent
by the host H, the Phagocyte P adaptively adjusts the puz- 
zle difficulty k based on the following two real-time statuses 
of the network environment. 

• Status of Phagocyte: This status indicates the usage 
of the Phagocyte's resources, i.e., the ratio of consumed re- 
sources to total resources possessed by the Phagocyte. The 
more resources a Phagocyte has consumed, the harder puz- 
zles the Phagocyte issues in the future. 

• Status of external host: In order to mount attacks 
against P2P hosts effectively, malicious external hosts have 
no alternative but to perform the interactions and solve 
many computational puzzles. That is, the more connections 
an external host tries to establish, the higher the probabil- 
ity that this activity is malicious and worm-like. Hence, 
the more puzzles an external host has solved in the recent
period of time, the harder puzzles the Phagocyte issues to
that external host. Note that, since a malicious external
host could simply spoof its IP address, in order to effectively
utilize the status of external host, our computational puzzle
scheme should have the capability of defending against IP
spoofing attacks, which we will describe later.

Subsequently, the Phagocyte P simply generates a
unique 64-bit session identifier $SI_P$ for the external host
according to the host's IP address $IP_H$ (extracted from the
IP header of the received message), the host's session
identifier $SI_H$ and the puzzle difficulty $k$, as follows:

$$SI_P = \mathrm{HMAC}_{secret}(IP_H \mid SI_H \mid k) \qquad (2)$$

Here, the HMAC is a keyed hash function for message 
authentication, and the secret is a 32-bit key which is pe- 
riodically changed and only known to the Phagocyte itself. 
Such secret limits the time external hosts have for com- 
puting puzzle solutions, and it also guarantees that an ex- 
ternal attacker usually does not have enough resources to 
pre-compute all possible solutions in step 3. 
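The stateless generation of $SI_P$ can be sketched as follows. The text does not fix the underlying hash, the byte encoding of the fields, or how the output is shortened to 64 bits, so HMAC-SHA1, the fixed-width concatenation, and the truncation to the first 8 bytes are all assumptions made for this illustration; the function names are hypothetical.

```python
import hashlib
import hmac
import ipaddress
import struct


def make_session_id(secret: bytes, host_ip: str, si_h: bytes, k: int) -> bytes:
    """SI_P = HMAC_secret(IP_H | SI_H | k), truncated to 64 bits (assumed)."""
    ip_bytes = ipaddress.ip_address(host_ip).packed   # IP_H as raw bytes
    message = ip_bytes + si_h + struct.pack("!B", k)  # k as one byte
    digest = hmac.new(secret, message, hashlib.sha1).digest()
    return digest[:8]


def check_session_id(secret: bytes, host_ip: str, si_h: bytes,
                     k: int, si_p: bytes) -> bool:
    """Recompute SI_P from the request fields instead of storing it."""
    expected = make_session_id(secret, host_ip, si_h, k)
    return hmac.compare_digest(expected, si_p)


# Example usage with a 32-bit secret and a 64-bit host nonce.
secret = b"\x01\x02\x03\x04"
si_h = bytes(8)
si_p = make_session_id(secret, "192.0.2.7", si_h, 10)
```

Because $SI_P$ binds the host address, the host nonce and the difficulty together, the Phagocyte can later recompute it without keeping per-host state, and a changed $k$ or a different source address fails the check.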

After the above generation process, the Phagocyte
replies to the external host at $IP_H$ with the host's session
identifier $SI_H$, the Phagocyte's session identifier $SI_P$ and
the puzzle difficulty $k$. Once the external host has received
this reply message, it first checks whether the received $SI_H$
is really generated by itself. If the received $SI_H$ is bogus,
the external host simply drops the message; otherwise, the
host stores the Phagocyte's session identifier $SI_P$
immediately. Such reply and checking operations can effectively
defend against IP spoofing attacks.

Step 3. The external host H retrieves the $\langle SI_H, SI_P \rangle$
pair as the global session identifier, and then tries to solve
the puzzle according to the equation below:

$$h(SI_H, SI_P, X) = Y^{0_k} \qquad (3)$$

Here, $h$ is a cryptographic hash function, $Y^{0_k}$ is
a hash value with the first $k$ bits being equal to 0, and
$X$ is the puzzle solution. Due to the properties of the hash
function, the external host has no way to figure out the solution
other than brute-force searching the solution space until a
solution is found, even with the help of many other solved
puzzles. The cost of solving the puzzle depends
exponentially on the difficulty $k$, which can be effortlessly adjusted
by the Phagocyte.
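A minimal sketch of solving and checking such a puzzle is given below. SHA-1 is used because the prototype described later uses it; the byte encoding of the inputs, the 8-byte encoding of $X$, and the function names are assumptions for illustration only.

```python
import hashlib
import itertools


def leading_bits_are_zero(digest: bytes, k: int) -> bool:
    """True if the first k bits of the digest are all 0."""
    value = int.from_bytes(digest, "big")
    return value >> (len(digest) * 8 - k) == 0


def puzzle_hash(si_h: bytes, si_p: bytes, x: int) -> bytes:
    """h(SI_H, SI_P, X) with X encoded as 8 big-endian bytes (assumed)."""
    return hashlib.sha1(si_h + si_p + x.to_bytes(8, "big")).digest()


def solve(si_h: bytes, si_p: bytes, k: int) -> int:
    """Brute-force search: try X = 0, 1, 2, ... until the hash qualifies."""
    for x in itertools.count():
        if leading_bits_are_zero(puzzle_hash(si_h, si_p, x), k):
            return x


def verify(si_h: bytes, si_p: bytes, k: int, x: int) -> bool:
    """The Phagocyte's check: a single hash evaluation."""
    return leading_bits_are_zero(puzzle_hash(si_h, si_p, x), k)
```

The asymmetry is the point of the design: the solver's work grows exponentially with $k$, while verification always costs one hash computation.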

After the brute-force computation, the external host
sends the Phagocyte a message including the global session
identifier (i.e., the $\langle SI_H, SI_P \rangle$ pair), the puzzle difficulty,
the puzzle solution and the actual request. Once the
Phagocyte has received this message, it performs the following
operations in turn:

a) Check whether the session identifier $\langle SI_H, SI_P \rangle$ is
really fresh based on the database of the past global session
identifiers. This operation can effectively defend against
replay attacks.

b) Check whether the Phagocyte's session identifier $SI_P$
can be correctly generated according to equation 2.
Specifically, this operation can additionally check whether the
difficulty level $k$ reported by the external host is the original $k$
determined by the Phagocyte.

c) Check whether the puzzle solution is correct according
to equation 3, which will also not incur significant
overhead on the Phagocyte.

d) Store the global session identifier $\langle SI_H, SI_P \rangle$, and act
as the overlay proxy to transmit the request submitted by 
the external host. Note that, in our scheme, the Phagocyte 
stores the session-specific data and processes the actual re- 
quest only after it has verified the external host's puzzle so- 
lution. That is, the Phagocyte does not commit its resources 
until the external host has demonstrated the sincerity. 

Specifically in the above sequence of operations, if one 
operation succeeds, the Phagocyte continues to perform the 
next; otherwise, the Phagocyte cancels all the following op- 
erations, and the entire interaction ends. More details about 
the puzzle design rationale can be found in iH). 
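The ordered, short-circuiting checks a) to d) can be sketched as below. The message layout (a dictionary with `si_h`, `si_p` and `request` keys), the in-memory set standing in for the database of past session identifiers, and the injected callables `check_si_p`, `check_solution` and `forward` are all hypothetical placeholders, not part of the described prototype.

```python
from typing import Callable, Dict, Set, Tuple


def handle_request(msg: Dict,
                   seen_sessions: Set[Tuple[bytes, bytes]],
                   check_si_p: Callable[[Dict], bool],
                   check_solution: Callable[[Dict], bool],
                   forward: Callable[[object], None]) -> bool:
    """Run checks a) to d) in order; stop at the first failure."""
    session = (msg["si_h"], msg["si_p"])
    if session in seen_sessions:     # a) freshness: reject replays
        return False
    if not check_si_p(msg):          # b) SI_P (and hence k) is genuine
        return False
    if not check_solution(msg):      # c) puzzle solution is correct
        return False
    seen_sessions.add(session)       # d) commit state only after all checks
    forward(msg["request"])          #    and proxy the request into the overlay
    return True
```

Note that nothing is stored and no request is forwarded until every check has passed, which mirrors the requirement that the Phagocyte commits resources only after the external host has done its work.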

4.2 Comparison and Analysis 

So far, several computational puzzle schemes ll8l [T0l[T6l 
have been proposed. However, most of them consider only 
the status of resource providers, so they cannot reflect the 
network environment completely. Recently, an ingenious 
puzzle scheme, Portcullis [17], was proposed. In Portcullis,
since a resource provider gives priority to requests contain- 
ing puzzles with higher difficulty levels, to gain access to 
the requested resources, each resource requester, no matter 
legitimate or malicious, has to compete with each other and 
solve hard puzzles under attacks. This may influence legiti- 
mate requesters' experiences significantly. 

Compared with existing puzzle schemes, our adap- 
tive and interaction-based computational puzzle scheme 
satisfies the fundamental properties of a good puzzle 
scheme [10]. It treats each external host distinctively by
performing a lightweight interaction to flexibly adjust the 
puzzle difficulty according to the real-time statuses of the 
network environment. This guarantees that our computa- 
tional puzzle scheme does not influence legitimate external 
hosts' experiences significantly, and it also prevents a mali- 
cious external host from attacking P2P overlay without in- 
vesting unbearable resources. 

In real-world networks, hosts' computation capabilities 
vary a lot, e.g., the time to solve a puzzle would be much 
different between a host with multiple fast CPUs and a host 
with just one slow CPU. To decrease the computational dis- 
parity, some other kinds of puzzles, e.g., memory-bound 
puzzle [3], could be complementary to our scheme. Note



that, with low probability, a Phagocyte may also be com- 
promised by external worm attackers, then they could per- 
form the topological worm propagation; here, our proposed 
internal defense mechanism could be employed to defend 
against such attacks. 

5 Experimental Design 

In our experiments, we first implement a prototype sys- 
tem, and then construct a massive-scale testbed to verify the 
properties of our prototype system. 

5.1 Prototype System 

Internal Defense. We implement an internal defense
prototype system including all basic modules described in
section 3. Here, a Phagocyte monitors each of its connected
P2P hosts' latest 100 requests. Firstly, if more than half
of the managed P2P hosts perform similar behaviors, the
responsible Phagocyte considers that the managed zone is
being exploited by worm attackers. Secondly, if more than
half of a Phagocyte's neighboring Phagocytes perform
similar operations, the Phagocyte considers that its neighboring
Phagocytes are being exploited by worm attackers. In
particular, the similarity is measured based on equation 1
with a threshold $\theta_d$ of 0.5. Then, in the local isolation
module, if a Phagocyte has detected worm attacks, the
Phagocyte will cut off the associated links between the infection
zone and the connected P2P hosts. Afterwards, in the alert
propagation module, if a Phagocyte has detected any worm
attacks, it will broadcast a worm alert to all its neighboring
Phagocytes; further, if a Phagocyte receives worm alerts from
more than half of its neighboring Phagocytes, i.e., $\theta_a > 0.5$,
the Phagocyte will also broadcast the alert to all its
neighboring Phagocytes that did not send out the alert. Finally, in
the software patching module, the Phagocytes acquire the
patches from the closest one of the system maintainers (i.e.,
100 online trusted Phagocytes in our testbed), and then
distribute them to all their managed P2P hosts. We have not yet
integrated the signature scheme into the software patching
module of our prototype system.

Note that, in the above, we simply set the parameters 
used in our prototype system, and in real-world systems, the
system designers should appropriately tune these parame- 
ters according to their specific requirements. 

External Protection. We utilize our adaptive and 
interaction-based computational puzzle module to develop 
the external protection prototype system. In this
prototype system, we use SHA-1 as the cryptographic hash
function. Generally, solving a puzzle with difficulty level $k$ will
force an external host to perform $2^{k-1}$ SHA-1 computations
on average. In particular, the difficulty level k varies be- 
tween and 26 in our system — this will cost an exter- 



Table 1 : Network Traces of Gnutella 





Trace 1 


Trace 2 


Trace 3 


Trace 4 


Trace 5 


Trace 6 


Number (abbr., #) of Phagocytes (Ultra-peers) 


158,985 


209, 723 


51,400 


51,400 


51,400 


14, 705 


# of managed P2P hosts (Leaf-peers) 


717,025 


1,026,231 


512,448 


342, 757 


257, 080 


73, 539 


# of Phagocytes : # of managed P2P hosts 


22.17% 


20.44% 


10.03% 


15.00% 


19.99% 


20.00% 


# of Phagocytes : # of all P2P hosts 


18.15% 


16.97% 


9.12% 


13.04% 


16.66% 


16.66% 



nal host 0.0 second (k = 0) to 24.728 seconds (k = 26) 
on our POWERS CPUs. In addition, the change cycle of 
the puzzle-related parameters is set to 5 minutes. Yet, we 
have not integrated our prototype system with the scalable 
distributed DNS system, and this work will be part of our 
future work. 
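
To give a feel for how the puzzle cost grows with the
difficulty level k, the following Python sketch implements a
generic brute-force hash puzzle. It is not the adaptive,
interaction-based scheme of the prototype: the construction
(a hidden k-bit value bound to a random nonce) and all function
names are assumptions made for illustration. A sequential
search over a k-bit space needs about 2^(k-1) SHA1 evaluations
on average.

```python
import hashlib
import secrets
from typing import Tuple


def make_puzzle(k: int) -> Tuple[bytes, bytes]:
    """Hypothetical puzzle: hide a uniformly random k-bit value
    behind SHA1(nonce || value). Returns (nonce, target digest)."""
    nonce = secrets.token_bytes(16)
    value = secrets.randbelow(1 << k)  # uniform in [0, 2^k)
    digest = hashlib.sha1(nonce + value.to_bytes(4, "big")).digest()
    return nonce, digest


def solve_puzzle(nonce: bytes, digest: bytes, k: int) -> Tuple[int, int]:
    """Brute-force the hidden value.
    Returns (solution, number of SHA1 evaluations used)."""
    for attempts, candidate in enumerate(range(1 << k), start=1):
        guess = hashlib.sha1(nonce + candidate.to_bytes(4, "big")).digest()
        if guess == digest:
            return candidate, attempts
    raise ValueError("no solution in the k-bit search space")


def verify(nonce: bytes, digest: bytes, solution: int) -> bool:
    """The verifier needs a single SHA1 evaluation."""
    return hashlib.sha1(nonce + solution.to_bytes(4, "big")).digest() == digest
```

The asymmetry is the point of such a design: the solver's work
doubles with every increment of k, while verification stays at
one hash.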

5.2 Testbed Construction 

We use realistic network traces crawled from a
million-node Gnutella network by the Cruiser [23] crawler.
The dedicated massive-scale Gnutella network is composed
of two tiers: the ultra-peer tier and the leaf-peer tier.
For historical reasons, the ultra-peer tier consists of not
only modern ultra-peers but also some legacy-peers that
reside in the ultra-peer tier but cannot accept any
leaf-peers. Specifically, in our experiments, the ultra-peers
excluding legacy-peers perform the functions of Phagocytes,
and the leaf-peers act as the managed P2P hosts.

Then, we adopt the widely accepted GT-ITM [31] to generate
a transit-stub model consisting of 10,047 routers for
the underlying hierarchical Internet topology. There are 10
transit domains at the top level with an average of 10
routers in each, and each pair of these transit routers is
linked with a probability of 0.5. Each transit router has an
average of 10 stub domains attached, and each stub has an
average of 10 routers, with each pair of stub routers linked
with a probability of 0.1. There are two million end-hosts
uniformly assigned to routers in the core by local area
network (LAN) links. The delay of each LAN link is set to
5 ms and the average delay of core links is 40 ms.
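
As a sanity check on the topology parameters, the short Python
sketch below computes the nominal router count implied by the
stated averages. The constant names are illustrative
assumptions. The generated model has 10,047 routers rather than
the nominal figure, presumably because the domain sizes are
averages and not fixed values.

```python
# Average values quoted in the text for the transit-stub model.
TRANSIT_DOMAINS = 10
ROUTERS_PER_TRANSIT_DOMAIN = 10
STUBS_PER_TRANSIT_ROUTER = 10
ROUTERS_PER_STUB = 10


def nominal_router_count():
    """Return (transit routers, stub routers, total) from the averages."""
    transit = TRANSIT_DOMAINS * ROUTERS_PER_TRANSIT_DOMAIN
    stub = transit * STUBS_PER_TRANSIT_ROUTER * ROUTERS_PER_STUB
    return transit, stub, transit + stub


def hosts_per_router(end_hosts: int = 2_000_000, routers: int = 10_047) -> float:
    """Average end-hosts per router, assuming (for illustration only)
    that hosts are spread evenly over all generated routers."""
    return end_hosts / routers


print(nominal_router_count())     # (100, 10000, 10100)
print(round(hosts_per_router()))  # 199
```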

The crawled Gnutella networks model a realistic P2P
overlay, and the generated GT-ITM network models the
underlying Internet topology; we therefore deploy the
crawled Gnutella networks on top of this topology to
simulate a realistic P2P network environment. We do not
model queuing delay, packet loss, or cross traffic, because
modeling them would make simulation at this scale
infeasible.

Table 1 lists the Gnutella traces that we use in our
experiments, with different node populations and/or
different percentages of Phagocytes.

• Trace 1: Crawled by Cruiser on Sep. 27th, 2004. 

• Trace 2: Crawled by Cruiser on Feb. 2nd, 2005. 

• Trace 3: Based on trace 1, we randomly remove a subset
of Phagocytes; then, we remove the isolated Phagocytes,
i.e., those that no longer connect to any other Phagocyte;
finally, we remove the isolated managed P2P hosts, i.e.,
those that no longer connect to any Phagocyte.

• Trace 4: Based on trace 3, we remove a part of man- 
aged P2P hosts randomly. 

• Trace 5: Based on trace 4, we further remove a part of 
managed P2P hosts randomly. 

• Trace 6: Based on trace 1, we use the same method as 
described in the generation process of trace 3. In addition, 
we remove an extra part of managed P2P hosts. 
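
The derivation procedure used for traces 3 and 6 can be
sketched in Python as follows. This is an illustrative sketch,
not the tooling used for the experiments: the adjacency-map
representation, the function name, and the keep_fraction
parameter are assumptions, and Phagocyte links are assumed to
be symmetric.

```python
import random
from typing import Dict, Set, Tuple

Links = Dict[str, Set[str]]


def derive_trace(phago_links: Links, leaf_links: Links,
                 keep_fraction: float, seed: int = 0) -> Tuple[Links, Links]:
    """phago_links maps a Phagocyte to its neighboring Phagocytes;
    leaf_links maps a managed host to the Phagocytes it connects to."""
    rng = random.Random(seed)
    candidates = sorted(phago_links)
    # Step 1: keep a random subset of Phagocytes (i.e., remove the rest).
    kept = set(rng.sample(candidates, int(len(candidates) * keep_fraction)))
    # Step 2: drop Phagocytes left without any Phagocyte neighbor.
    survivors = {p for p in kept if phago_links[p] & kept}
    new_phago = {p: phago_links[p] & survivors for p in survivors}
    # Step 3: drop managed hosts left without any surviving Phagocyte.
    new_leaf = {leaf: ups & survivors
                for leaf, ups in leaf_links.items() if ups & survivors}
    return new_phago, new_leaf
```

Traces 4 and 5 would then follow by randomly removing entries
from the managed-host map of the result.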

6 Evaluation Results 

6.1 Performance Metrics 

In our experiments, we characterize the performance
under various circumstances using three metrics:

• Peak infection percentage of all P2P hosts: The ratio 
of the maximum number of infected P2P hosts to the total 
number of P2P hosts. This metric indicates whether Phago- 
cytes can effectively defend against internal attacks. 

• Blowup factor of latency: The ratio of the latency
between an external host and the P2P overlay when routing
via the Phagocytes to the latency under direct routing.
This metric indicates how efficiently our Phagocytes
filter the requests from external hosts to the P2P overlay.

• Percentage of successful external attacks: The ratio of 
the number of successful external attacks to the total num- 
ber of external attacks. This metric indicates the effective- 
ness of our Phagocytes to prevent external hosts from at- 
tacking the P2P overlay. 
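
The three metrics can be written down as a minimal Python
sketch. The function names and input formats are assumptions
for illustration; in particular, the blowup factor is taken
here to be the ratio of the two latencies.

```python
from typing import Sequence


def peak_infection_percentage(infected_over_time: Sequence[int],
                              total_hosts: int) -> float:
    """Maximum number of simultaneously infected hosts, as a
    percentage of all P2P hosts."""
    return 100.0 * max(infected_over_time) / total_hosts


def latency_blowup_factor(via_phagocytes_ms: float, direct_ms: float) -> float:
    """Latency via the Phagocytes divided by direct-routing latency."""
    return via_phagocytes_ms / direct_ms


def successful_attack_percentage(successful: int, total: int) -> float:
    """Share of external attacks that succeed, in percent."""
    return 100.0 * successful / total
```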

6.2 Internal Defense Evaluation 

In our prototype system, we model a percentage of
Phagocytes and a percentage of managed P2P hosts as
initially immune; all other hosts are vulnerable. Moreover,
a percentage of P2P hosts are initially infected, and these
are distributed uniformly at random among the vulnerable
Phagocytes and vulnerable managed P2P hosts. All the
infected P2P hosts perform the active P2P worm attacks
(described in section [TT]), while the internal defense
modules deployed at each participant try to defeat such
attacks. With the experimental parameters described in
Table 2, we conduct four experiments to evaluate the
internal defense mechanism.

Table 2: Experimental Parameters (Internal Defense)

                                                     Experiment 1   Experiment 2   Experiment 3   Experiment 4
Initial percentage (abbr., %) of immune Phagocytes    100% to 50%            95%            95%            95%
Initial % of immune managed P2P hosts                         10%      0% to 30%            10%            10%
Initial infection % of all vulnerable P2P hosts      10-^% to 50%   10-^% to 50%   10-^% to 50%   10-^% to 50%
Used traces                                                   ...            ...     1, 2, 5, 6        3, 4, 5
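
The initial host states used in the internal defense
experiments can be sketched in Python as follows. The function
name, the string labels, and the use of rounding to turn
fractions into host counts are assumptions for illustration.

```python
import random
from typing import Dict, List


def assign_initial_states(phagocytes: List[str], managed_hosts: List[str],
                          immune_phago_frac: float, immune_host_frac: float,
                          infected_frac: float, seed: int = 0) -> Dict[str, str]:
    """Label every host as 'immune', 'infected' or 'vulnerable'."""
    rng = random.Random(seed)
    # Immune hosts are drawn separately from each tier.
    immune = set(rng.sample(phagocytes,
                            round(len(phagocytes) * immune_phago_frac)))
    immune |= set(rng.sample(managed_hosts,
                             round(len(managed_hosts) * immune_host_frac)))
    # Infected hosts are drawn uniformly at random from all
    # vulnerable hosts, regardless of tier.
    vulnerable = [h for h in phagocytes + managed_hosts if h not in immune]
    infected = set(rng.sample(vulnerable,
                              round(len(vulnerable) * infected_frac)))
    states = {}
    for host in phagocytes + managed_hosts:
        if host in immune:
            states[host] = "immune"
        elif host in infected:
            states[host] = "infected"
        else:
            states[host] = "vulnerable"
    return states
```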

Experiment 1 — Impact of immune Phagocytes: With 
seven different initial percentages of immune Phagocytes, 
we fix the initial percentage of immune managed P2P hosts 
to 10%, and vary the number of initial infected P2P hosts 
so that these infected hosts make up between 10^'^% and 
50% of all the vulnerable P2P hosts. We then investigate
the impact of immune Phagocytes by calculating the
peak infection percentage of all P2P hosts. The experimental
result shown in Figure 3 demonstrates that when the
initial infection percentage of all vulnerable P2P hosts is
low (e.g., < 1%), the Phagocytes provide good containment
of active P2P worms; otherwise, the worm propagates very
fast, but the Phagocytes still provide sufficient
containment. This property also holds in the following
experiments. Interestingly, the initial percentage of immune
Phagocytes has no obvious effect on the performance of our
system. This is a good property because we do not actually
need a high initial percentage of immune Phagocytes; it
also implies that increasing the number of immune
Phagocytes adds little further defense. Thus, we conclude
that the Phagocytes are effective and scalable in performing
detection, local isolation, alert propagation and software
patching.

Experiment 2 — Impact of immune managed P2P 
hosts: In this experiment, with 95% of Phagocytes initially
immune, we investigate the performance of our system
with various initial percentages of immune managed
P2P hosts in steps of 10%. The result shown in Figure 4 is
within our expectation: the peak infection percentage of all
P2P hosts decreases as the initial percentage of immune
managed P2P hosts grows. In real-world overlay networks,
even a powerful attacker could initially control tens of
thousands of overlay hosts (1% to 5% on the X-axis); hence,
we conclude that our Phagocytes are capable of defending
against active P2P worms effectively even in a highly
malicious environment.

Experiment 3 — Impact of network scale: Figure 5
plots the performance of our system in terms of different 
network scales. In traces 1, 2, 5 and 6, there are different 
node populations, but the ratios of the number of Phago- 
cytes to the number of all P2P hosts are all around 17%. The 
experimental result indicates that our system can indeed 
help defend against active P2P worms in various overlay 
networks with different network scales. Furthermore, al- 
though the Phagocytes perform more effectively in smaller 
overlay networks (e.g., traces 5 and 6), they can still work 
quite well in massive-scale overlay networks with million- 
node participants (e.g., traces 1 and 2). 

Experiment 4 — Impact of the percentage of Phago- 
cytes: In our system, the Phagocytes perform the functions 
of defending against P2P worms. In this experiment, we
evaluate the system performance with different percentages
of Phagocytes but the same number of Phagocytes. The
result in Figure 6 indicates that the higher the percentage
of Phagocytes, the better the defense against active P2P
worms. That is, increasing the percentage of Phagocytes
consistently improves the ability to defend against active
P2P worms in the overlay network. Further, the experimental
result also implies that we do not need a large number of
Phagocytes to perform the defense functions: around 10% of
the node population functioning as Phagocytes is sufficient
for our system to provide effective worm containment.




Figure 7: Cumulative distribution of the latency penalty
between external hosts and the P2P overlay via the
Phagocytes and direct routing (X-axis: blowup factor of
latency).

Figure 8: Cumulative distribution of the absolute latency
difference between external hosts and the P2P overlay via
the Phagocytes and direct routing.

Figure 9: Effectiveness of protection against external
worms attacking the P2P overlay.



6.3 External Protection Evaluation 

In this section, we conduct two more experiments in our 
prototype system to evaluate the performance of the exter- 
nal protection mechanism. 

Experiment 5 — Efficiency: In this experiment, we 
show the efficiency in terms of the latency penalty between 
the external hosts and the P2P overlay via the Phagocytes 
and direct routing. Based on trace 1, we have 100 external 
hosts connect to every P2P host via the Phagocytes and di- 
rect routing in turn. Then, we measure the latencies for both 
cases. 

Figure 7 plots the measured latency penalty. We can see
that, when routing via the Phagocytes, about 30% and 80%
of the connections between the external hosts and P2P
hosts have a latency blowup factor of less than 2 and 2.5,
respectively. Figure 8 shows the corresponding absolute
latency difference, from which we can further deduce that
the latency increase for more than half of these
connections (via the Phagocytes) is less than 150 ms.
Because our proposed computational puzzle scheme requires
interaction, some latency penalty is expected when routing
via the Phagocytes. In return, the puzzle scheme lets our
system protect against external attacks effectively, as
we illustrate in the next experiment. Hence, there is a
tradeoff between efficiency and effectiveness.

Experiment 6 — Effectiveness: In this experiment, 
based on trace 1, we have 100 external worm attackers flood 
all Phagocytes in the P2P overlay. Then, we evaluate the 
percentage of successful external attacks to show the effec- 
tiveness of our protection mechanism against external hosts 
attacking the P2P overlay. For other numbers of external
worm attackers, we obtain similar experimental results.

In Figure 9, the X-axis is the attack frequency, i.e., the
rate at which external hosts mount worm attacks on the P2P
overlay, and the Y-axis is the percentage of successful
external attacks. The result clearly illustrates the
effectiveness of Phagocytes in protecting the P2P overlay
from external worm attacks. Our adaptive and
interaction-based computational puzzle module at the
Phagocytes plays an important role in this result. Even
in an extremely malicious environment, our system is still
effective. That is, to launch worm attacks, the external
attackers have no alternative but to solve hard
computational puzzles, which impose a heavy burden on them.
From Figure 9, we can also see that as the attack
frequency decreases, the percentage of successful external
attacks increases gradually. However, with a low attack
frequency, the attackers cannot perform practical attacks.
Even if some external attacks are mounted successfully, our
internal defense mechanism can mitigate them effectively.

7 Related Work 

P2P worms could exploit the pervasive P2P overlays
to achieve fast worm propagation, and recently, many P2P
worms have been reported to employ real-world
P2P systems as their spreading platforms [2, 19, 21]. The
very first work in [32] highlighted the dangers posed by P2P
worms and studied the feasibility of self-defense and
containment inside the P2P overlay. Afterwards, several
studies [6, 19] developed mathematical models to understand
the spreading behaviors of P2P worms, and showed that P2P
worms, especially the active P2P worms, indeed pose more
deadly threats than normal scanning worms.

Recognizing such threats, many researchers started to
study the corresponding defense mechanisms. Specifically,
Yu et al. in [20] presented a region-based active
immunization defense strategy to defend against active P2P
worm attacks; Freitas et al. in [9] utilized the diversity
of participating hosts to design a worm-resistant P2P
overlay, Verme, for containing possible P2P worms;
moreover, in [29], Xie and Zhu proposed a partition-based
scheme to proactively block possible worm spreading as
well as a connected dominating set based scheme to achieve
fast patch distribution in a race with the worm, and in
[28], Xie et al. further designed a P2P patching system
through file-sharing mechanisms to internally disseminate
security patches. However, existing defense mechanisms
generally focus on internal P2P worm defense without
considering external worm attacks, so they cannot provide
total worm protection for P2P overlay systems.

8 Conclusion 

In this paper, we have addressed the deadly threats posed 
by active P2P worms which exploit the pervasive and pop- 
ular P2P applications for rapid topological worm infection. 
We build an immunity system that responds to the active 
P2P worm infection by using Phagocytes. The Phagocytes 
are a small subset of specially elected P2P hosts that have 
high immunity and can "eat" active P2P worms in the P2P 
overlay networks. Each Phagocyte manages a group of P2P 
hosts by monitoring their connection patterns and traffic 
volume. If any worm events are detected, the Phagocyte 
will invoke the internal defense strategies for local isola- 
tion, alert propagation and software patching. Besides, the 
Phagocytes provide the access control and filtering mech- 
anisms for the communication establishment between the 
P2P overlay and external hosts. The Phagocytes forbid the 
P2P traffic to leak from the P2P overlay to external hosts, 
and further adopt a novel adaptive and interaction-based 
computational puzzle scheme to prevent external hosts from 
attacking the P2P overlay. To sum up, our holistic immunity 
system utilizes the Phagocytes to achieve both internal de- 
fense and external protection against active P2P worms. We 
implement a prototype system and validate its effectiveness 
and efficiency in massive-scale P2P overlay networks with 
realistic P2P network traces. 